Presenting Data and Information by Edward Tufte

Metadata

  • Started: 2021-10-15
  • Instructor: Edward Tufte
  • What: Data Visualisation workshop by Edward Tufte
  • Status: Currently ongoing

Part 1

Clutter, confusion, and overload are not inherent to information. They are failures of design.

"The information is the interface"

The course kicks off with a lullaby, MIDI notes charted visually. A project management chart for a lullaby by Chopin. Can hear a note better by tracking it visually.

"Making and consuming presentations is an intellectual and moral activity. It's all about your content and credibility."

Spectatorship is an intellectual and moral activity that must hold the presenter responsible and judge the content.

Presenting data

"Excellence in statistical graphics consists of complex ideas communicated with clarity, precision, and efficiency." - Edward Tufte, B-The Visual Display of Quantitative Information

Good data visualisations should:

  • Show the data.
  • Use a well-suited format and design.
  • Induce viewers to consider the content rather than the methodology, design, or other trappings. Avoid 'chart junk' and content-free decoration.
  • Not distort the data.
  • Be information dense.
  • Make large data sets understandable.
  • Reveal the data at different levels of detail-- from a broad overview to fine details-- at an accessible complexity of detail.
  • Serve a clear purpose with a narrative quality.
  • Work closely with the data and verbal descriptions of the data, using words, numbers, and drawings together.
  • Use an appropriate balance, proportion, and sense of relevant scale.

"Graphical elegance is often found in simplicity of design and complexity of data."

"Graphics reveal data"

Insights and patterns that are hidden in raw data can become clear when displayed graphically. However, "statistical graphics, just like statistical calculations, are only as good as what goes into them." Put bad data in, reveal silly 'trends'.

The choice of design

Basic structures for conveying data:

  • the sentence
  • the table
  • the graphic

A sentence, while interesting to think about as an essential unit of data communication, is not very helpful for more than one unit of information, as it prevents easy comparisons within the data. That's where tables come in (hello Resources for designing data tables). But beware The perils of pie charts. "The only worse design than a pie chart is several of them."

What story does the data reveal? "There are nearly always better sequences than alphabetical."

Simplicity ≠ Clarity, Information dense ≠ complex

From B-Envisioning Information

"Clutter and confusion are failures of design, not attributes of information."

Leverage visual comparison (which humans are great at), not visual memory (which humans are not so great at). [[!Context switch]]ing undermines information exchange and happens when users are forced to remember things across views. Short-term memory is then occupied with retrieving and remembering the data rather than analysing it.

Information dense displays give control of information to viewers rather than editors or designers. They require an active rather than passive viewer. This does not mean cluttered or confusing presentations, for the quantity of detail is a separate issue from the difficulty of reading.

"It is not how much empty space there is, but rather how it is used. It is not how much information there is, but rather how effectively it is arranged."

In Interaction of Color, [[!Josef Albers]] talks about how stripping the detail out of typography actually makes words harder to read (see [[!Serif vs sans-serif]]) and the same principle can be true for data visualisation. Simpleness is an aesthetic, not a strategy or guide to clarity, and can result in useless or uninteresting data.

"It must embody the difficult unity of inclusion rather than the easy unity of exclusion... Where simplicity cannot work, simpleness results. Blatant simplification means bland architecture. [[!Less is a bore.]]" - Robert Venturi, Complexity and Contradiction in Architecture

Tufte points out that the world we seek to understand is complex and intricate. It is only right we portray it with appropriate intricacy and detail. "God is in the details" according to Mies van der Rohe, and that sums up the central thesis of Tufte in this portion: disentangle the presentation and aesthetic of the information from its inherent complexity and rich content. A good design should be capable of presenting extremely complex, information dense content in a clear, understandable, and enlightening fashion.

Part 2

Google Maps is possibly the most viewed visualisation in human history and it has layers upon layers of data encoded in a readable format.

Design for a diversity of data and viewers

"When you show people data, your job is not to dumb things down, it's to make everyone who looks at it smarter."

Tufte starts by diving into a seemingly simple website for the National Weather Service and how they present forecast data. One visualisation with different layers of complexity and detail can serve different users and enlighten them all without over-simplifying for one or the other. Allow people to 'edit' and choose what to look at. We as humans already do this all the time.

Do whatever it takes to convey information. Don't segregate by the medium or method of communication. Words, pictures, etc. they are all information at the end of the day.

Charts encode information

The issue with pies and bar charts is that the data is encoded into an area or colour or both. This means viewers have to decode it before they get to the data. And that code can vary from chart to chart. Sometimes the best way to display information is simply to show it, not encode it. Start with words and numbers. This can also help support different levels of exactitude for different viewers.

Eliminate every impediment and noise. Even a colon makes a difference when millions of people extract data from a display every day.

Be efficient in your display. No one has ever wished meetings longer. For presentations to a content audience, keep your formats straightforward and the content rich. Viewers should spend their time understanding the content, not decoding the design. Conventional does not mean universal, but conventional to your audience.

Consistency is not everything

Always enquire as to the motivations behind consistency. Sometimes consistency is good, other times it is 'because you should' or a brand marketing ploy that requires much effort with little impact.

Sparklines: word-sized graphics

"There is a tremendous amount of information, even in a single letter."

Graphics should have at least the resolution of type, as that is what people can read.

The most common data display is a noun with a number, with their relevant context giving meaning to a single number. Ex: glucose 6.6. Pairing this with a small sparkline (trend line chart), can encode additional information in a very small space. Think of what you see in stock listings, spreadsheet cells, etc.

Consider the task of the viewer. For clinical analysis, is the task to detect and assess deviations outside the normal limits? Emphasise that. Perhaps use a band that not only dampens values within the normal limits, reducing noise, but also highlights those outside.
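A minimal sketch of the noun + number + sparkline idea, with a normal band that stays quiet for in-range readings and flags the rest. This is my own illustration rather than anything shown in the workshop: the function names, the glucose readings, and the 4.0 to 7.0 band are all invented for the example.

```python
# Hypothetical helpers; the readings and band limits below are made-up illustration data.
BLOCKS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    """Map each value to one of eight block characters, scaled to the series range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no range to scale against, so draw a flat line.
        return BLOCKS[0] * len(values)
    return "".join(BLOCKS[int((v - lo) / span * (len(BLOCKS) - 1))] for v in values)

def out_of_range_marks(values, low, high):
    """Blank under readings inside the normal band, '^' under those outside it."""
    return "".join(" " if low <= v <= high else "^" for v in values)

def noun_number_sparkline(name, values, low, high):
    """Noun, latest number, and word-sized trend on one line; band flags underneath."""
    head = f"{name} {values[-1]} "
    flags = " " * len(head) + out_of_range_marks(values, low, high)
    return head + sparkline(values) + "\n" + flags

readings = [5.1, 5.4, 5.0, 6.2, 7.9, 8.4, 7.1, 6.6]  # invented values
print(noun_number_sparkline("glucose", readings, low=4.0, high=7.0))
```

The in-range readings get no mark at all, so the only ink spent on the band goes to the deviations the viewer is looking for.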

[[!Recency bias]] gives too much weight to the most recent data coming in. A.k.a. panic. Providing proper context and more data (highs, lows, historical, etc.) can combat this.

Use direct labels where possible, not a key. Colours can correspond to each other without the need to explain.

Science shows the data, reports uncertainty

If you know something well, you should be able to explain it in ordinary language. If you can't or rely on jargon, you probably don't fully understand it (see the [[!Feynman technique]]).

The truth depends on the correspondence between the model and the data. The danger of hiding the data is you can fit any model to the data and cherry-pick or oversimplify. In other words, you can lie.

Data sins

  • Truncation: only show a small sample of data, especially if it fits your forecast or point. You want to see more horizontal points going out because that helps provide the context of the data.
  • Binning: Condense data into a single number or chart element. Un-binning reveals deeper details and context around that data.
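A small sketch of what binning hides, assuming two invented series that collapse to the same single number. The data and the function names are hypothetical, chosen only to make the point visible.

```python
from statistics import mean

# Invented readings for two hypothetical groups; illustration only.
steady = [48, 49, 50, 50, 50, 51, 52]
volatile = [5, 10, 20, 50, 80, 90, 95]

def binned(values):
    """Binning: collapse the whole series into one summary number."""
    return mean(values)

def unbinned(values):
    """Un-binning: keep every point, plus the range that the mean hides."""
    return {"points": sorted(values), "low": min(values), "high": max(values)}

for name, series in (("steady", steady), ("volatile", volatile)):
    print(name, "mean:", binned(series), "detail:", unbinned(series))
```

Both series bin to the same mean, yet one spans 4 units and the other 90. Only the un-binned view shows that.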

Novel findings and fiction

Don't trust truncated and summarised data. Or the people showing it. The defence is that it would clutter things up. But that means someone thought it was cluttered. Someone edited it.

The Phillips Curve is an example of an economic model which did not hold up but seemed perfect on a limited dataset. Fresh data remodels models. "Their argument is not even wrong, it can't be tested by evidence."

Models can create imaginary thresholds and/or plateaus not present in the data. You can try hundreds of models on a dataset and cherry-pick one. Follow the money, get to the original data.
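A rough simulation of the "try hundreds of models and pick one" problem, as I understand it. Everything here is an assumption for illustration: the outcome and all 200 candidate "predictors" are pure random noise, and the sample size, model count, and seed are arbitrary. The sketch uses a hand-written Pearson correlation as the measure of fit.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def cherry_pick(n_points=12, n_models=200, seed=0):
    """Score many pure-noise predictors against a pure-noise outcome."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_points)]
    scores = []
    for _ in range(n_models):
        predictor = [rng.gauss(0, 1) for _ in range(n_points)]
        scores.append(abs(pearson(predictor, outcome)))
    scores.sort()
    # The "best" model is the one a cherry-picker would report;
    # the median is what a typical noise predictor achieves.
    return {"best": scores[-1], "median": scores[len(scores) // 2]}

result = cherry_pick()
print(f"typical |r|: {result['median']:.2f}, cherry-picked |r|: {result['best']:.2f}")
```

None of the predictors has any real relationship to the outcome, so whatever the best score looks like, it is an artefact of searching. Seeing only the winning model, without the 199 discarded ones, is what makes it look like a finding.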

Beware self-congratulation: 'significantly' (there are different types of significance. Statistical? Substantive? What's the data?), 'novel', n-number showing off, etc.

Increase resolution, increase the rate of information transfer

The human eye-brain system is incredibly powerful and processes huge amounts of data. Use it.

Increased resolution means not having one thing after another ('the deck', controlled by an editor) but being able to see more all at once (controlled by user, everyone can scan and choose differently). It's the same in presentations. The only thing worse than someone reading aloud from a bullet point list is when the bullets are revealed one by one. The rate of information transfer is dropping to zero.

Decoding & the Braun corporation dishwasher loading manual

Encoded chart data leaves users searching for how to decode it rather than taking in the data. If possible, don't separate the labels from the data. The less the data is encoded, and so the less the viewer has to decode, the better. People have not come to learn codes. They've come to see the data.

Use direct labelling. Why have people memorise colours and jump between chart and a key? Especially for accessibility and colour blindness.

Analytical thinking principles

  • Make comparisons (also a fundamental task of data analysis).
  • Explain, not just report. Cause and effect, not just correlation.
  • Investigate the credibility of the source of the data.

[[!Analytical thinking]] helps us reason about the relationship between information and conclusions. It is content independent. Looking for causality, not necessarily the content. So you can make judgements (like in the case of data visualisations and presentations), even if you do not know as much about the content. Focus on the relationship between evidence and conclusion.

Even if a presentation obscures, don't assume motives. Judge based on what you have evidence on. Conspiracies and malice are greatly overrated, accidents and incompetence are greatly underrated. Focus on the content, don't assume the character.

Turn analytical thinking into analytical design

"The purpose of information display is to assist reasoning about the content."

  • Show comparisons, contrasts, and differences.
  • Show causality, mechanism, explanation, systematic structure.
  • The world is multivariate and spatial, not flat. Show multivariate data (more than 1-2 variables).
  • Don't segregate by the mode of production. Completely integrate words, numbers, images, diagrams, whatever. It's all information.
  • Document the evidence. What is this? Who is involved? Who are the sponsors? What are the data sources? Show the measurement scales, point out relevant issues, demonstrate your credibility and show the data. No one is likely to look at it, but it points to integrity of material. It promotes honesty and responsibility. Beware of presenters who will not share.
  • Content matters most of all. The best way to improve your presentation is to get better data. Analytical presentations should stand or fall depending on the quality, relevance, and integrity of their content.

[[!Software segregates information]]

"Analytical presentations ultimately stand or fall depending on the quality, relevance, and integrity of their content."

The world profits by segregating information by the mode of production. Apps own data and you can only access your own data in particular apps, operating systems, pages, etc. But this is detrimental to information and to us. The user doesn't get to think about their documents or data; they have to think about all these applications. Helpful for business, yes, but unnatural and not beneficial for users. Software says you cannot draw, write, read, and share in the same tool. We have gotten used to working around it.

ASKING: How do you know that?

"Truth is truth. It cannot be overruled by any speciality, or anything you do, think, or believe."

  • How do I know that?
  • Can I recognise when I know less than I think? Confirmation bias is strong.
  • Then ask how do you know that?
  • How can anyone possibly know that? Perform thought experiments. Are there any research designs that could answer this question? The claim might not even be wrong, it could be impossible to prove.

Why research on humans is way more difficult than rocket science

Nature's mathematical laws apply to every particle everywhere, forever. Truth and exactitude are always present. There are certain universal models in biological systems, but this exactitude is not present everywhere. The researchers are humans with biases and they are researching humans who can act, think, connive, etc.

Small tilts throughout the process can completely change (or create) a finding.

Part 3: Smarter meetings mean shorter meetings

A.k.a. Edward Tufte's PowerPoint takedown.

"The biggest problem in communication is the illusion that it has taken place." - George Bernard Shaw

  • Begin with a document and study hall.
  • Content and credibility
  • Think about your audience
  • Discuss after study hall
  • Practice your presentation
  • Show up early. Finish early.

"I hate the way people use slide presentations instead of thinking." - Steve Jobs

Begin with a document

"Think complex, speak simple." - Jeff Bezos

Apparently Bezos and Amazon were early adopters of this method of starting meetings with a 6-page memo. An actual document, with actual sentences. When you have to write out ideas, it forces clarity and helps you think it through (see Write to think). Don't send it out in advance. Few people (if any) read these in advance, so make the reading a part of the meeting. Gathering together means total concentration on the content.

Physically people can read 2x as fast as they can talk. Practically it's even faster because they skim and jump around. Readers have control of the information, rather than being passive viewers of a presentation. They can jump to what is important to them. Where they might have interrupted with a question, they can see it answered later in the document.

"It's not what you say, it's what they hear." - Red Auerbach

Preparing your presentation

Think about your content and credibility

  • Provide sources and data
  • Have quotes from experts in the field. If you really want to go deep, find a quote from an opposite view. Explain why you did not go that way.

Think about your audience.

  • Be glad they showed up!
  • Do not be patronising. Do not underestimate your audience.

Practice!

  • Don't just do it on your own. Drag in a colleague.
  • Rehearse with people who give honest feedback.
  • Record your presentation and watch it back. It's horrifying, but effective.
  • After watching, try turning the video off and just listen to the sound.
  • Remember: rehearsal is more nerve-racking than the actual thing. This is the worst it will ever be and is designed to catch things that go wrong.

Starting and ending a meeting

  • Plan some questions to guide any discussion.
  • Show up early. It's a show of respect to the audience and/or organisers.
  • Finish early!

Diagrams and maps

Assessing the quality of data displays

  • A graphic should provide reasons to believe
  • Google Maps encodes a huge number of data layers. Why doesn't your graphic? The raw data is also right there underneath it. And it's viewed billions of times a day. Granted it is close to the world we live in and recognise, which helps.

Cartography as data visualisation

"The world is three dimensional. Why are we living so in flatland when it comes to looking at data?"

  • Cartography is a huge data visualisation challenge.
  • Cartography (and so data visualisation) does not have to be flat. Can encode information in multiple dimensions.
  • Data visualisations have an editorial voice. Use it. What's a feature? What's important? What can fade into the background?

Data analysis when the truth matters

On the relationship between evidence and conclusions

  • Learning the truth is difficult. Falsifying is easy.
  • There is plenty of money sloshing around.
  • Dr. [[!Confirmation bias]] conducts the research. No one escapes.

Individual vs statistical lives

Take anonymous, statistical lives as seriously as identified, individual lives. Named lives are often preferred and appeal to us more than statistics. We optimise for our local and insiders instead of outsiders. The same with rescues versus prevention.

Tufte brought up this observation from Andrew Vickers: "A mistake in the operating room can threaten the life of one patient, a mistake in statistical analysis or interpretation can lead to hundreds of early deaths. So it is odd that, while we allow a doctor to conduct surgery only after years of training, we give software packages - SPSS, R, Python, MatLab, AI - for statistical analysis to almost anyone." I think it is less a point about the software packages than about how much more readily we latch on to individual issues (ex: a doctor's mistake killing a patient) than to statistical issues (ex: fudging the numbers for the test on a new drug).

Food for thought: "What if we treated statistical lives the same way we treat 'named' lives?"

It's hard to fake reality: detecting fraud

Fraud can often be detected by carefully examining images and data graphics. Data visualisation reveals. It's hard to fake reality. A single fraudulent or incorrect research paper might not seem like a big deal but then other papers, drug applications, etc. can use that as evidence.

Redesigning the sentence

"The most powerful tool in poetry is the linebreak."

Originally paper (or rather, papyrus and animal skins) was expensive and hard to obtain, so writers wanted to fill the space: fit in as many words as possible. Now paper and especially computer space is cheap but we still follow that thinking.

Content-responsive typography

"Sentences do not exist by themselves, but have natural, inevitable, unavoidable interactions with their surrounding space. Sentences are not independent of their spatial context, and interactions can create meanings and harms."

  • Design for content responsive typography and layouts. Not just responsive typography.
  • Sentences have a spatial context. They can survive being jammed together, but that is not thriving.
  • Sentences should aim for meaning: inconvenient meaning over convenient production.
  • Space can and should be content-responsive and actively contribute to meaning. Linebreaks are powerful. This is seen in poetry, maps, math, code, comics, etc. but the web seems to draw more from traditional, type-stuffed books for convenient production.
  • Visual spacing can differentiate and clarify sentences. Design to make meaning more consequential, memorable, recognisable, and retrievable.
  • Not complex sentences but rich sentences. They can convey great depths of meaning and information without being hard to read.

Self-awareness of what we see

"To see the ordinary so intensely that the ordinary becomes extraordinary, becoming so focused, so specific about something, that it becomes something other than what it ordinarily is. [...] Not seeing something different is not seeing anything at all."

Most medieval castles were made of wood. We think most were made of stone because of [[!Survivor bias]]. This bias affects everything.

See / Learn / Do

"The most dangerous phrase in the language is 'We've always done it this way'" - Grace Hopper

Ask:

  • How do you/I/they really know that?
  • How could you/I/they possibly ever know that?

Reason intensely:

  • About what you see
  • Verbs
  • Links
  • Mechanisms
  • Connections
  • Dynamics
  • About what things do, not what they are named
  • Across multiple time-horizons (then, now, forever)
  • Compare, model, choose, decide, compare again

Thinking eyes act

"Good ideas are a dime a dozen for a smart person, what distinguishes good from great is how an idea is executed -- how it becomes reality." - Craig Venter

To act is essential, it is the difference between consumer vs producer, anecdote vs evidence, etc.

  • Make something of seeing and reasoning. Discover, produce, construct.
  • Understand, explain, show.
  • Gather consequences beyond yourself
  • Do not rely on memory's imprecision
  • Identify and celebrate excellence, universal knowledge.

Misc

Tim Berners-Lee's original proposal for the web was a response to information being segregated by false hierarchies. Has the internet returned to that? Digital Garden response. A web, not a hierarchy. "The method of storage must not place its own restraints on the information." The medium is the message.

An internet dungeon. Adventure through Tim Berners-Lee's original diagram. The server start up of old one-esque observers trying to catalogue their research papers. Be the bug in the machine. Mouse in the maze.